Analyzing the Influence of Parsing Errors on Pre-reordering Performance for SMT

Authors

  • Dan Han
  • Pascual Martínez-Gómez
  • Yusuke Miyao
  • Katsuhito Sudoh
  • Masaaki Nagata
Abstract

Word alignment for long-distance language pairs is problematic in state-of-the-art phrase-based statistical machine translation. Linguistically motivated reordering models have been widely studied to address this challenge. One of the most popular and effective methods is pre-reordering, in which the words of a source-language sentence are rearranged to resemble the word order of the target language. There are two main ways to formulate the rearranging rules: one is to learn them automatically from data (Xia and McCord, 2004; Genzel, 2010); the other is to handcraft them based on linguistic studies (Isozaki et al., 2010; Han et al., 2012; Han et al., 2013). In both methods, syntactic information is obtained from automatic parsers. However, although these parsers produce parsing errors, current reordering methods do not include any mechanism for identifying or correcting them. To improve the robustness of pre-reordering methods, it is useful to explore the relationship between parsing and reordering. In this work, we use both empirical and descriptive approaches to analyze the effects of parsing errors on pre-reordering performance for Chinese-to-Japanese statistical machine translation. We examine the impact of parsing errors produced by a dependency parser called Corbit (Hatori et al., 2011) on a pre-reordering framework called unlabeled dependency parsing
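
To make the idea of dependency-based pre-reordering concrete, here is a minimal Python sketch. It is an illustration only and not the rule set of the framework analyzed in the paper: the function name `head_final_reorder`, the single "dependents before head" rule, and the example sentence with its head indices are all assumptions made for this example. It shows how a toy head-final rule moves a Chinese verb to the end of the clause, and how one wrong root attachment from the parser leaves the sentence unreordered.

```python
from typing import Dict, List


def head_final_reorder(tokens: List[str], heads: List[int]) -> List[str]:
    """Toy pre-reordering rule (hypothetical): emit each dependent subtree before its head.

    heads[i] is the index of token i's head in an unlabeled dependency tree,
    or -1 if token i is the root.
    """
    children: Dict[int, List[int]] = {i: [] for i in range(len(tokens))}
    root = -1
    for i, h in enumerate(heads):
        if h == -1:
            root = i
        else:
            children[h].append(i)

    def visit(node: int) -> List[str]:
        out: List[str] = []
        # Dependents keep their original relative order.
        for child in children[node]:
            out.extend(visit(child))
        # The head is emitted last, giving a head-final order.
        out.append(tokens[node])
        return out

    return visit(root)


# Assumed example: 我 (I) 吃 (eat) 红 (red) 苹果 (apple)
tokens = ["我", "吃", "红", "苹果"]

# Correct parse: the verb 吃 is the root, 我 and 苹果 depend on it, 红 modifies 苹果.
gold_heads = [1, -1, 3, 1]
# Erroneous parse (made up): the noun 苹果 is taken as the root and 吃 attaches to it.
wrong_heads = [1, 3, 3, -1]

reordered_gold = head_final_reorder(tokens, gold_heads)
reordered_wrong = head_final_reorder(tokens, wrong_heads)

print(reordered_gold)   # ['我', '红', '苹果', '吃']  (verb moved to the end)
print(reordered_wrong)  # ['我', '吃', '红', '苹果']  (verb left in place)
```

With the correct parse the verb ends up in clause-final position, as in Japanese; with the wrong root the rule has nothing to move, so the reordering step silently passes the error on to word alignment.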


Similar Resources

Effects of Parsing Errors on Pre-Reordering Performance for Chinese-to-Japanese SMT

Linguistically motivated reordering methods have been developed to improve word alignment, especially for Statistical Machine Translation (SMT) on long-distance language pairs. However, since they rely heavily on parsing accuracy, it is useful to explore the relationship between parsing and reordering. For Chinese-to-Japanese SMT, we carry out a three-stage incremental comparative analysis to ...


Phrase Reordering Model Integrating Syntactic Knowledge for SMT

The reordering model is important for statistical machine translation (SMT). Current phrase-based SMT technologies are good at capturing local reordering but not global reordering. This paper introduces syntactic knowledge to improve the global reordering capability of SMT systems. Syntactic knowledge such as boundary words, POS information, and dependencies is used to guide phrase reordering. Not on...


Weblio Pre-reordering Statistical Machine Translation System

This paper describes the details of the Weblio Pre-reordering Statistical Machine Translation (SMT) System, which participated in the English-Japanese translation task of the 1st Workshop on Asian Translation (WAT2014). In this system, we applied the pre-reordering method described in (Zhu et al., 2014) and extended the model to obtain N-best pre-reordering results. We also utilized N-best parse trees sim...


Long-distance reordering during search for hierarchical phrase-based SMT

Long-distance reordering of syntactically divergent language pairs is a critical problem. SMT has had limited success in handling these reorderings during inference, and thus deterministic preprocessing based on reordering parse trees is used. We consider German-to-English translation using Hiero. We show how to effectively model long-distance reorderings during search. Our work is novel in tha...


Improving Statistical Machine Translation with Processing Shallow Parsing

Reordering is of essential importance for phrase-based statistical machine translation (SMT). In this paper, we present a new method of reordering in phrase-based SMT. We were inspired by (Xia and McCord, 2004), which uses preprocessing reordering approaches. We used shallow parsing and transformation rules to reorder the source sentence. The experiment results from English-Vietnamese pair...



Publication date: 2014